[ 1 ] Single-column models (SCM) are useful test beds for investigating the
parameterization schemes of numerical weather prediction and climate models. The
usefulness of SCM simulations is limited, however, by the accuracy of the best estimate
large-scale observations prescribed. Errors in estimating the observations will result in
uncertainty in modeled simulations. One method to address the modeled uncertainty is to 
simulate an ensemble where the ensemble members span observational uncertainty. This 
study first derives an ensemble of large-scale data for the Tropical Warm Pool 
International Cloud Experiment (TWP-ICE) based on an estimate of a possible source of 
error in the best estimate product. These data are then used to carry out simulations with 
11 SCM and two cloud-resolving models (CRM). Best estimate simulations are also
performed. All models show that moisture-related variables are close to observations and 
there are limited differences between the best estimate and ensemble mean values. The 
models, however, show different sensitivities to changes in the forcing particularly when 
weakly forced. The ensemble simulations highlight important differences in the surface 
evaporation term of the moisture budget between the SCM and CRM. Differences are 
also apparent between the models in the ensemble mean vertical structure of cloud 
variables, while for each model, cloud properties are relatively insensitive to forcing. The 
ensemble is further used to investigate cloud variables and precipitation and identifies 
differences between CRM and SCM particularly for relationships involving ice. This 
study highlights the additional analysis that can be performed using ensemble simulations 
and hence enables a more complete model investigation compared to using the more 
traditional single best estimate simulation only. 
1. Introduction

[ 2 ] The Tropical Warm Pool International Cloud Experi- 
ment (TWP-ICE) took place around Darwin from 20 January 
to 13 February 2006 [May et al., 2008]. The data collected
during the experiment provide an opportunity to investigate
several different states of tropical convection. The
experiment collected sufficient information to derive both
the large-scale heat, momentum, and moisture budgets
[Xie et al., 2004] as well as detailed information on the
state of the smaller scale convection and associated clouds. 
Such data sets are commonly used in the modeling com- 
munity to carry out process-oriented studies in particular 
applying cloud-resolving models (CRM) and single-column 
models (SCM). One of the primary motivations for TWP- 
ICE was to enable the improvement of global climate models 
(GCM), which are known to be deficient in the repre- 
sentation of cloud and rainfall particularly associated with 
tropical convection. The international research community 
has conducted a suite of multimodel studies for TWP-ICE. 
A hierarchy of experiments enables the investigation of 
model errors as discussed in J. Petch et al. (Evaluation 
of intercomparisons of four different types of model sim- 
ulating TWP-ICE, submitted to Quarterly Journal of the 
Royal Meteorological Society, 2012) and includes GCM [Lin 
et al., 2012] and Limited Area Models [Zhu et al., 2012] 
forced with European Centre for Medium-Range Weather 
Forecasts (ECMWF) reanalysis as well as a CRM study 
[Fridlind et al., 2012] performing simulations driven by a 
single “best estimate” large-scale budget data set [Xie et al., 
2010]. This paper reports on the SCM component of the 
overall modeling strategy. One innovation applied here will 
be the use of an ensemble of SCM simulations to elucidate 
uncertainties in the estimation of model errors and to explore 
model sensitivities to changes in the data set driving the 
model simulations. 

[ 3 ] The investigation of model shortcomings through 
SCMs is a well-used method in the model development 
research community. Model development studies, which 
include a SCM component, have been instigated by the 
Global Energy and Water-Cycle Experiment (GEWEX) 
Cloud System Study (GCSS) [Randall et al., 2003] in con- 
junction with the U.S. Department of Energy Atmospheric 
System Research (ASR) to investigate a wide range of test 
cases including deep convection over the tropical ocean 
using data from the Tropical Ocean Global Atmosphere 
(TOGA) Coupled Ocean-Atmosphere Response Experiment
(COARE) [Webster and Lukas, 1992] intensive observation 
period [e.g., Woolnough et al., 2010; Bechtold et al., 2000] 
and convection over land exploiting extensive observations 
[e.g., Grabowski et al., 2006; Xie et al., 2005, 2002; Ghan et
al., 2000]. Investigation of the specific problem of the diur- 
nal cycle was conducted by the European Cloud Systems 
(EUROCS) project and discussed in Guichard et al. [2004]. 
These studies focussed on a limited number of model sim- 
ulations forced by a single data set, from hereon referred to 
as the “best estimate” forcing. While best estimates of the 
large-scale atmosphere are usually derived to depict the most 
probable state of the large-scale atmosphere, they do contain 
errors of usually unknown magnitude. These errors compli- 
cate the interpretation of the results of SCM simulations, 
as the discrepancies between the model-simulated fields and 
observations may be attributed to two sources: the prescription
of an incorrect large-scale state or errors in
model processes. By using a single-model realization of
the large-scale state, it is impossible to separate these two 
error sources. 

[ 4 ] Ensemble techniques are commonly used in numerical 
weather prediction (NWP) and climate models to investigate
model sensitivities and to determine uncertainty. These
ensembles may include perturbed initial conditions or vary- 
ing model parameters within a limited range. Multimodel 
ensembles have also been used to provide an estimate of 
the range of simulations. A limited number of studies also 
derived ensemble techniques for use in SCM studies. Hack 
and Pedretti [2000] added random perturbations to the initial 
conditions of their ensemble simulations and found consid- 
erable variations in simulated fields. Similar results were 
found when modifying the prescribed vertical motion field in 
a similar manner. Given the bifurcations discussed in Hack 
and Pedretti [2000], Hume and Jakob [2005, 2007] and Ball 
and Plant [2008] determined that an ensemble technique was 
appropriate for SCM. Hume and Jakob [2005] found that 
after about 18 h of simulation, results were increasingly sen- 
sitive to the prescribed forcing rather than differences in the 
initial conditions. For this reason, this TWP-ICE study uses 
an ensemble of large-scale forcing. 

[ 5 ] The goal of this study is to apply an ensemble SCM 
technique to the TWP-ICE experiment and to highlight 
additional opportunities for model evaluation that such a 
technique may provide. The technique is applied to a wide 
range of SCMs as well as a small number of CRMs, enabling 
the investigation of a range of model behaviors. The results 
from the ensemble simulation will be compared to those of 
single “best estimate” simulations. It will be shown that a 
particularly interesting aspect of the use of the ensemble 
technique in this context is the possibility to study model 
sensitivities with changing forcing data set. It is shown that 
the different models exhibit distinctly different ensemble 
behavior that is not apparent when comparing simulations 
with a single forcing data set. Section 2 summarizes the 
experimental design including the methodology used in the 
derivation of the ensemble large-scale forcing, the case spec- 
ification, and a description of the models. The main results of 
the study are discussed in section 3 followed by a summary 
and the main conclusions in section 4. 


2. Experimental Design 

[ 6 ] The experiments conducted here use both a best esti-
mate forcing and an ensemble of forcing data sets. The best
estimate data set used is that derived by Xie et al. [2010] and
is identical to that used in Fridlind et al. [2012]. Using an
ensemble approach enables a better understanding of model
accuracy and model sensitivity. As this
study includes a number of different models, these characteristics
are determined separately for each model. This section will
detail the design of the study including the ensemble forcing 
design, the case specification, and the participating models. 

2.1. Ensemble Design 

[ 7 ] A number of techniques exist to derive budgets from 
observational data collected in field campaigns. Here, the 
variational analysis technique of Zhang and Lin [1997] is 
used in the analysis of TWP-ICE observations. This tech- 
nique provides an estimate of area-averaged atmospheric 
and surface conditions using a combination of surface obser- 
vations, vertical profiles of the atmosphere, satellite obser- 
vations, and numerical model data. The variational analysis 
process minimizes a cost function for the heat, moisture, and
momentum budgets using constraints of top of atmosphere
and surface energy and moisture.

Figure 1. Time-averaged vertical profiles of omega over active (left) and suppressed (right) monsoon
for all ensemble members. Broken and light-colored lines show all ensemble members with key ensemble
members (5th, 25th, 50th, 75th, and 95th percentiles) as black continuous lines. The best estimate forcing,
used here and in the CRM intercomparison, is shown by small circles. Note the different x axis.

[8] One of the constraints used in the variational analysis 
method is the domain-average surface rainfall. In the case of 
the TWP-ICE experiment, this domain-mean surface rainfall 
is derived from radar data. Compared to the use of rain gauge 
observations, this improves the spatial representativeness of 
the estimate, but this comes at the expense of accuracy of 
the local rainfall estimates as radar measurements need to be 
converted to rainfall. It has been shown [Zhang et al., 2001]
that the surface rainfall has a large effect on the derived 
forcing data set; for example, the analyzed vertical veloc- 
ity is very sensitive to rainfall. Furthermore, the derivation 
of surface rainfall from radar data is also highly complex 
and liable to large errors [Joss and Waldvogel, 1990]. These 
errors will have a large effect on the derived forcing data set. 

[ 9 ] One method to address uncertainty in large-scale forc- 
ing data is to derive an ensemble of forcing data. Only a 
short summary of the method to derive such an ensemble is 
given here, with more details provided in the Appendix. The 
method is principally based on estimates of errors in the rain- 
fall estimates that are a key input to the variational budget 
analysis. A comparison of radar-derived and rain gauge data 
is carried out to provide an estimate of the error in the radar 
estimates of domain-average rainfall. From these error esti- 
mates, 100 equally likely alternative domain-mean rainfall 
time series are calculated. The 100 rainfall time series are 
then used as inputs to the variational analysis to derive 100 
alternative versions of the large-scale state using the same 
variational technique as is used to derive the best estimate 
large-scale state. These 100 large-scale states constitute the 
forcing ensemble used in this study. 
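
The first step of this procedure, generating alternative rainfall time series from an error estimate, can be illustrated with a minimal sketch. The error model below is an assumption made purely for illustration and is not the method of the Appendix: each member scales the best estimate series by one lognormally distributed factor with a median of 1. The function name `make_rainfall_ensemble` and the value `rel_error=0.3` are hypothetical.

```python
import math
import random


def make_rainfall_ensemble(best_estimate, rel_error=0.3, n_members=100, seed=0):
    """Return n_members perturbed copies of a domain-mean rainfall series.

    Assumption (not from the paper): the radar error is multiplicative,
    lognormal, and constant in time for each member.
    """
    rng = random.Random(seed)
    # Log-space standard deviation giving a relative error (coefficient
    # of variation) of rel_error for the scaling factor.
    sigma = math.sqrt(math.log(1.0 + rel_error ** 2))
    ensemble = []
    for _ in range(n_members):
        factor = rng.lognormvariate(0.0, sigma)  # median factor is 1
        ensemble.append([factor * rain for rain in best_estimate])
    return ensemble


# Hypothetical 3-hourly rainfall values in mm/h.
members = make_rainfall_ensemble([0.0, 1.2, 4.5, 0.8])
print(len(members), len(members[0]))
```

In the study, each such series would then be passed through the variational analysis to obtain one member of the forcing ensemble; that step is not reproduced here.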

[ 10 ] When deriving the large-scale state using these alter- 
nate rainfall time series, all other observations have the 
same values as the best estimate, for example, tempera- 
ture, moisture, and horizontal wind fields. Given that the 
boundary values of temperature and moisture are identical 
between all realizations, the horizontal advection terms of 
temperature and moisture differ very little. The variational 
analysis process generally equates larger values of rain- 
fall with increased low-level convergence and upper level 
divergence and therefore generally larger values of vertical
velocity. The structure of the derived vertical velocity, however,
is also dependent on other budget terms so that vertical
velocity does not monotonically increase with rainfall. 

[ 11 ] Figure 1 shows the vertical velocity profile aver- 
aged over both the active and suppressed monsoon for each 
ensemble member as well as key percentiles of the ensem- 
ble. Stronger vertical motion is derived from time series 
with larger rainfall. In the active monsoon, there is always 
strong upward vertical motion, although the ensemble mem- 
bers with weaker rainfall have weaker vertical motion. 
During the suppressed monsoon, the ensemble members 
with strong rainfall have upward vertical motion at all 
levels. The ensemble members with weaker rainfall have 
upward motion at lower levels (below 650 hPa) but down- 
ward motion above. In addition to the ensemble members, 
Figure 1 shows the standard best estimate for vertical veloc- 
ity. As is evident, the best estimate results are close, but 
not identical, to the 50th percentile of the ensemble forc- 
ing. While there is a large spread in omega, it is worth 
noting that 50% of the ensemble members lie in the lim- 
ited range between the 25th and 75th percentile lines. While 
each ensemble member is equally likely, most cluster around 
the 50th ensemble member and the best estimate, and the 
most extreme omega values are rare. These differences in 
omega imply changes in low-level convergence and upper 
level divergence through the continuity equation and will 
have an effect on convection. These 100 large-scale “forc- 
ing” data sets are then used as input to the model simulations 
discussed below. 
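
The link between omega and divergence follows from the continuity equation in pressure coordinates, d(omega)/dp = -D, where D is the horizontal divergence. The sketch below integrates this relation downward from the top level, assuming omega = 0 there. It is an illustration only and is not part of the variational analysis; the function name and the sample profile are hypothetical.

```python
def omega_from_divergence(pressure_hpa, divergence):
    """Integrate d(omega)/dp = -divergence downward from the top level.

    pressure_hpa: levels in hPa ordered from top (smallest) to surface.
    divergence:   horizontal divergence (1/s) on the same levels.
    Returns omega in hPa/s, with omega = 0 assumed at the top level.
    """
    omega = [0.0]
    for k in range(1, len(pressure_hpa)):
        dp = pressure_hpa[k] - pressure_hpa[k - 1]
        # Trapezoidal estimate of the layer-mean divergence.
        mean_div = 0.5 * (divergence[k] + divergence[k - 1])
        omega.append(omega[-1] - mean_div * dp)
    return omega


# Hypothetical profile: divergence aloft, convergence near the surface.
levels = [100.0, 300.0, 500.0, 700.0, 900.0]
div = [2e-6, 2e-6, 0.0, -2e-6, -2e-6]
print(omega_from_divergence(levels, div))
```

With upper level divergence and low-level convergence, omega is negative (upward motion) throughout the column and peaks in the midtroposphere, consistent with the qualitative behavior described above.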

2.2. Case Description 

[ 12 ] The TWP-ICE experiment experienced a range of 
atmospheric conditions. At the start of the experiment, the 
region experienced monsoon conditions. Between 23 and 
24 January, a strong mesoscale convective system (MCS) 
passed through the domain followed by relatively sup- 
pressed conditions. There were then clear conditions from 
3 to 5 February with little rain followed by monsoon break 
conditions to the end of the field campaign. Full details of 
the meteorological conditions can be found in May et al. 
[2008]. In this study, the focus is on the active period defined 
as 20 00Z-25 12Z Jan and the suppressed period defined as
28 00Z Jan-2 12Z Feb. The conditions during the clear and
break periods are dominated by a strong diurnal cycle, which
is driven by the land-sea contrast in the experiment domain.
As SCMs cannot usually represent such contrasts in a single
grid box, the later part of the experiment is excluded from
the simulations presented here.

Table 1. Models Contributing to SCM Study^a

Model     Full Name                                  Modeler               Reference
UM-GR     Unified Model-Gregory and Rowntree         M. Whitall/R. Plant   Davies et al. [2005]
UM-PC     Unified Model-Plant/Craig                  R. Keane/R. Plant     Davies et al. [2005]
SCAMS     Single-Column Community Atmospheric Model  X. Liu/X. Shi         Collins et al. [2006]
SCAML     Single-Column Community Atmospheric Model  X. Liu/X. Shi         Wang et al. [2009]
SCAMR     Single-Column Community Atmospheric Model  X. Song/G. Zhang      Collins et al. [2006]
NCEPG     NCEP GFS Model                             W. Wang               EMC [2003]
GFDL2     GFDL-AM2 Model                             Y. Lin                GAMDT [2004]
GISS      GISS Model                                 A. Wolf/A. DelGenio   Schmidt et al. [2006]
CLUBB     Cloud Layers Unified by Binormals model    B. Nielsen/V. Larson  Golaz et al. [2002]
JMA1      Japan Meteorological Agency                T. Komori             JMA [2007]; Nakagawa [2009]
JMA2      Japan Meteorological Agency                T. Komori             JMA [2007]; Nakagawa [2009]
2-D LEM   UK Met Office Large Eddy Model             A. Hill               Gray et al. [2001]
3-D SAM   System for Atmospheric Modeling            L. Davies             Khairoutdinov and Randall [2003]

^a Includes the acronym used in this paper, the full model name, contributing author(s), and the main reference for the model.
Further model details are given in the text and the references therein. Note that there are two cloud-resolving models as part of this
study.

[ 13 ] In order to investigate the performance of the ensem- 
ble technique proposed here in different meteorological 
conditions, the study applies two sets of large-scale forcing 
data. The first is a best estimate simulation forced using the 
standard data set [Xie et al., 2010]. These simulations can be 
directly compared to the CRM results [Fridlind et al., 2012], 
and the best estimate simulations also form the basis of dis- 
cussion in J. Petch et al. (submitted manuscript, 2012). In 
this study, the best estimate simulations will be used to form 
a SCM multimodel ensemble. In addition to the best esti- 
mate simulations, all models were run using the 100-member 
ensemble of forcing data derived above. It was found that 
some models showed numerical instabilities for the strongest 
forcing data sets (i.e., those derived from the largest rainfall) 
when using their standard time-stepping. As a consequence, 
the 10 strongest forcing data sets and, for reasons of symme- 
try, the 10 weakest ones are excluded from further analysis, 
reducing the ensemble size to 80. 
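
The symmetric trimming described above can be sketched as follows. The sketch assumes that forcing strength is ranked by the time-mean rainfall of each member, which is one reading of "derived from the largest rainfall"; the function name `trim_ensemble` is hypothetical.

```python
def trim_ensemble(mean_rainfall, n_trim=10):
    """Return the indices of members kept after symmetric trimming.

    mean_rainfall: one time-mean rainfall value per ensemble member
                   (assumed here to be the measure of forcing strength).
    n_trim:        number of members dropped at each end of the ranking.
    """
    # Rank member indices from weakest to strongest rainfall.
    order = sorted(range(len(mean_rainfall)), key=lambda i: mean_rainfall[i])
    return order[n_trim:len(order) - n_trim]


# Hypothetical ensemble of 100 members with increasing mean rainfall.
kept = trim_ensemble([0.1 * i for i in range(100)])
print(len(kept))
```

Trimming both tails keeps the retained ensemble centered on the same median as the full ensemble, which a one-sided removal of the strongest members would not.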

[ 14 ] The aim when defining the model specification is to 
impinge as little as possible on the inherent characteristics 
of the individual models, and modelers are encouraged to 
use their preferred configurations; however, the following 
requirements are made for all simulations: 

[15] 1. The TWP-ICE domain has mixed surface types 
making the choice of surface type unclear. All simulations 
assume an ocean surface consistent with Fridlind et al. 
[2012]. A fixed, time-invariant SST of 29°C is used. Interactive
surface fluxes are required to be calculated in the boundary 
layer scheme. 

[16] 2. Simulations are initialized with observed tempera- 
ture and moisture profiles at 0300Z 19 January 2006. 

[ 17 ] 3. An observed ozone profile is used where possi- 
ble, but the McClatchey ozone profile [McClatchey et al., 
1972] is used above the maximum height of observations 
(40 mbar). 

[ 18 ] 4. Full interactive radiation is used with a diurnal
cycle for a domain centered on the Atmospheric Radiation 
Measurement (ARM) site (12.425°S, 130.891°E). 


[ 19 ] 5. Mean horizontal winds are relaxed to observed 
profiles with a 2 h time scale. There is no nudging of the tem- 
perature and moisture fields which are left free to respond to 
the forcing. 

[ 20 ] 6. Horizontal advective tendencies for temperature 
and moisture are prescribed, but the vertical terms are calculated
by the models. Sensitivity studies showed a warm temperature
bias above the tropopause when the total forcing was
prescribed, because the model then cannot freely evolve the
vertical advection associated with this warming and so cannot
reduce the temperature bias. This method differs from Fridlind et al. [2012] where
temperature and moisture were nudged towards observed 
profiles to avoid such temperature biases. 
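
The wind relaxation in requirement 5 corresponds to a tendency of the form du/dt = -(u - u_obs)/tau with tau = 2 h. A minimal sketch of one time step is given below. The backward-Euler discretization is an assumption chosen for illustration; each participating model applies the relaxation within its own time-stepping scheme, and the function name is hypothetical.

```python
TAU_SECONDS = 2 * 3600.0  # 2 h relaxation time scale from the case specification


def relax_wind(u_model, u_obs, dt_seconds, tau_seconds=TAU_SECONDS):
    """Advance du/dt = -(u - u_obs) / tau by one backward-Euler step.

    u_model, u_obs: wind profiles (m/s) on the same vertical levels.
    """
    weight = dt_seconds / tau_seconds
    # Solving u_new = u + weight * (u_obs - u_new) for u_new.
    return [(u + weight * uo) / (1.0 + weight) for u, uo in zip(u_model, u_obs)]


# Hypothetical two-level profile and a 30 min time step.
print(relax_wind([10.0, -4.0], [6.0, -2.0], dt_seconds=1800.0))
```

The implicit form remains stable for any time step, which is convenient when the model time step is not small compared with the relaxation time scale.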

2.3. Models 

[ 21 ] In this section, a brief description of all models used 
in this study will be given. Table 1 gives a summary of 
the models with further details given below. The study also 
includes two CRM which also simulate the ensemble. The 
CRM provide an important reference for the SCM and link 
to the CRM study [Fridlind et al., 2012].

[ 22 ] The UK Met Office SCM [Davies et al., 2005] con-
tains parameterizations for radiation [Edwards and Slingo,
1996], layer-cloud microphysics [Wilson and Forbes, 2004;
Wilson and Ballard, 1999], boundary layer processes [Lock 
et al., 2000], and convection; see also Martin et al. [2006]. 
Results were submitted for both the default UM convection 
scheme [Gregory and Rowntree, 1990; Martin et al., 2006; 
Derbyshire et al., 2011] (UM-GR) and the Plant and Craig 
[2008] stochastic spectral mass-flux scheme (UM-PC). In 
the default scheme, convection is triggered by instability 
of surface parcels at the lifting condensation level (LCL); 
a CAPE closure is used for deep convection, and the clo- 
sure for shallow convection is based on Grant [2001]. In 
the Plant-Craig scheme, convection is triggered by con- 
structing potential updraft source layers and evaluating their 
buoyancy at the LCL; a CAPE closure is used. The stochas- 
tic variability of the Plant-Craig scheme depends upon the 
column size; an area of (50 km)^2 was used here.

[ 23 ] The single-column model of the NCAR CAM3 
(SCAM) contains the radiation scheme as described in 
Collins et al. [2006]. The treatment of cloud condensa- 
tion and microphysics in CAM3 [Boville et al., 2006] is 
based on Rasch and Kristjánsson [1998] as updated by
Zhang et al. [2003] with separate prognostic equations for
the liquid and ice-phase condensate. The boundary layer 
scheme is based on Holtslag and Boville [1993] and Boville 
et al. [2006]. CAM3 includes the convection scheme of 
Zhang and McFarlane [1995] with CAPE closure. CAM3-
Liu (SCAML) [Wang et al., 2009] differs from SCAM
in its cloud microphysics: it introduces
a double-moment cloud microphysics scheme [Liu et al., 2007],
explicit treatment of ice nucleation [Liu and Penner, 2005],
and water vapor deposition on ice crystals and the Bergeron-
Findeisen process in pure ice and mixed-phase clouds.
SCAMR differs most fundamentally from SCAMS as the 
deep convection parameterization is replaced by the revised 
Zhang and McFarlane [1995] scheme proposed by Zhang 
[2002]. The new convection scheme uses CAPE changes 
due to large-scale forcing (e.g., large-scale advection, radia- 
tive cooling) in the free troposphere, instead of CAPE itself, 
for closure. 

[ 24 ] In the NCEP GFS model, the longwave radiation
scheme follows Fels and Schwarzkopf [1975] and
Schwarzkopf and Fels [1991]. The shortwave radiation formulation
uses multiband techniques [Slingo, 1989; Chou et
al., 1998; Kiehl et al., 1998]. The cloud condensate is prognosed
from a single-moment microphysics scheme [Zhao
and Carr, 1997]. The boundary layer parameterization uses
a nonlocal scheme [Hong and Pan, 1996]. The penetrative convection
scheme [Pan and Wu, 1995] is simplified from
Arakawa and Schubert [1974], with a quasi-equilibrium
assumption as a closure. Convection is triggered when a
cloud work function exceeds a threshold. Shallow convec- 
tion is parameterized as an extension of the vertical diffusion 
scheme [Tiedtke, 1983]. 

[ 25 ] The GFDL AM2 uses the shortwave radiation algo- 
rithm of Freidenreich and Ramaswamy [1999], and the 
longwave radiation follows Schwarzkopf and Ramaswamy 
[1999]. It uses Slingo [1989] and Held et al. [1993] for liq- 
uid cloud radiative properties and Fu and Liou [1993] for 
ice clouds. The microphysics scheme uses Rotstayn [1979] 
with cloud fraction prognosed following Tiedtke [1993]. The 
microphysics used for convective clouds is rather crude 
with prescribed precipitation efficiencies for shallow and 
deep convection. The boundary layer scheme follows Lock
et al. [2000]. GFDL uses the relaxed Arakawa-Schubert 
scheme [Moorthi and Suarez, 1992] with a CAPE closure 
for shallow and deep convection. 

[ 26 ] The GISS SCM used here is a developmental 
update of the Schmidt et al. [2006] model. Radiation 
uses explicit multiple scattering calculations and the k- 
distribution approach to absorption. Large-scale clouds are 
based on the prognostic cloud water parameterization of Del 
Genio et al. [1996], including all relevant microphysical pro- 
cesses, detrainment, and cloud top entrainment. Convective 
microphysics follows Del Genio et al. [2005], which inter- 
actively partitions condensate into precipitating, detrained, 
and vertically advected components. The boundary layer 
uses dry conserved variables and includes local (diffu- 
sive) and counter-gradient flux terms. Moist convection is 
parameterized using a mass-flux scheme with convection 
triggered when a lifted parcel becomes buoyant. The mass 
flux is that required to produce neutral buoyancy at cloud 
base, with updraft speeds and entrainment rates based on 
Gregory [2001]. 


[ 27 ] The CLUBB model, in these TWP-ICE simulations, 
is used in conjunction with the BUGSrad radiative transfer 
scheme [Stephens et al., 2001] and a single microphysics
scheme [Morrison et al., 2009] for all clouds. Although in
the prior literature CLUBB was tested only for boundary
layer cloud cases [Golaz et al., 2002; Larson and Golaz,
2005; Larson et al., 2012], here CLUBB is used to sim- 
ulate both deep and shallow clouds with a single, unified 
equation set. Unlike Larson et al. [2012], here CLUBB is 
run as a single-column model and handles all cloud types 
without the use of a cloud-resolving model or any other host 
model. CLUBB prognoses various higher-order moments 
and achieves closure by use of a single multivariate subgrid 
PDF of velocity, moisture, and temperature. CLUBB has no 
explicit convective trigger; rather, the turbulence and ther- 
modynamic variability generated in shallow convection are 
intended to evolve into deep convection when and where the 
large-scale forcings are appropriate. 

[ 28 ] The single-column model JMA1 contains the param- 
eterizations of the default Global Spectral Model [JMA, 
2007; Nakagawa, 2009]. The radiation scheme uses a two-stream
method with the delta-Eddington approximation for the shortwave
and table look-up and k-distribution methods for the longwave.
Cloud condensation and microphysics are based on
Smith [1990] and Sundqvist et al. [1989]. The boundary 
layer scheme is the level 2 closure scheme of Mellor and 
Yamada [1974]. The convection scheme is a multiplume type 
with cloud work function closure based on Arakawa and 
Schubert [1974], two types (for shallow and deep con- 
vection) of prognostic equations of the upward mass-flux 
[Randall and Pan, 1993] and triggering functions [Xie and 
Zhang, 2000]. JMA2 is the same as JMA1, except for using 
modified convection and cloud schemes (T. Komori and 
K. Yoshimoto, Evaluation from a perspective of spin-down 
problem: Moistening effect of convective parameterization, 
submitted to CAS/JSC WGNE Research Activities in Atmo- 
spheric and Oceanic Modeling, 2012). 

[ 29 ] There are two CRM in the study which are briefly 
described here. The UKMO Large Eddy Model
(LEM) uses the shortwave and longwave radiation scheme
of Edwards and Slingo [1996]. The LEM employs a three-
phase microphysics scheme, which is described in Gray
et al. [2001], and the microphysical configuration is the
same as the UKMO-2A setup described in Fridlind et al.
[2012]. The subgrid mixing scheme is a modified first-order
Smagorinsky-Lilly scheme, which is described in MacVean
and Mason [1990]. 

[ 30 ] The System for Atmospheric Modeling (SAM)
is described by Khairoutdinov and Randall
[2003] and uses the BUGSrad radiation scheme described 
in Stephens et al. [2001]. Single-moment microphysics were 
used as outlined in Khairoutdinov and Randall [2003]. 
The subgrid mixing scheme is a 1.5-order closure model 
[Khairoutdinov and Kogan, 1999]. The SAM model simu- 
lates nine ensemble members equally spaced in the range 
10-90. 

3. Results 

3.1. Simulations of Humidity and Precipitation 

[ 31 ] This section gives an overview of both the temporal
evolution and the vertical structure of the simulation
of several moisture-related variables in the various models.
Particular focus is given to comparing moisture-related variables
as large errors can arise in models potentially due to the
dependence of moisture on error-prone parameterizations.
The convective component of total surface precipitation is
discussed to highlight the different roles of model parameterization
between the active and suppressed periods. Model
accuracy will be discussed by comparison to observations
for each model. The ensemble is then used to investigate
model sensitivity in terms of the sources and sinks in the
moisture budget. The best estimate is contrasted with the
ensemble mean to directly determine how using an average
of many simulations might affect results compared to a
single simulation.

Figure 2. Time series of precipitation for the active period (left) and suppressed period (right) for each
model type. Colored lines show the average for each model type (e.g., all UM SCM, all SCAM, and all
JMA models are averaged together) and gray lines the 80-member ensemble for all models. The best
estimate observed precipitation is given in the heavy black line. Note the different y axis.


3.1.1. Overall Simulation Behavior 

[ 32 ] Figure 2 shows time series of surface precipitation for 
the active and suppressed periods. Model ensemble means 
are shown as colored lines with individual ensemble mem- 
bers from all model simulations overlaid in gray. In this 
figure, all UM-type, SCAM-type, and JMA-type models are 
averaged together as they are very similar. Observations are 
shown as a heavy black line. This plot allows broad interpre- 
tation of the characteristics of each model while capturing 
the spread of the ensemble. Figure 2 shows that all models
have similar precipitation during the active period with
moderate precipitation before the passage of the MCS on
23-24 January. All models have similar heavy rain associated
with the MCS. The ensemble is spread around this mean
with the largest spread occurring during the MCS. Modeled
precipitation is close to observations, which may be anticipated,
as in strongly forced conditions precipitation will be
predominantly driven by forcing in all ensemble members
[Xie et al., 2005; Xu et al., 2002; Woolnough et al., 2010].

Figure 3. Mean precipitation averaged over the active and suppressed periods. The left box is the multimodel
ensemble constructed from the best estimate simulations (MM BE) averaged over the period for
each model. There are nine individual ensemble SCM with 80 members and two ensemble CRMs, a 2-D
Met Office LEM simulation with 80 members, and a 3-D SAM model with 9 members. The far right has
the ensemble of observations. The box represents the 25th, 50th, and 75th percentiles with the 5th and
95th percentiles being shown by the horizontal bars. The ensemble mean data are shown by the small
asterisk. The best estimate is shown for the SCM data and observations as large asterisks. The ensemble
is averaged for each ensemble member over all times.

[ 33 ] Period-mean precipitation during the suppressed 
period is lower than during the active period. It is evi- 
dent that the relative differences in the ensemble mean time 
evolution between models as well as the differences from 
observations are larger than those in the active period. This 
might be expected as the forcing is weaker and as a conse- 
quence has less of an influence on the model solutions. In 
weakly forced conditions, it is expected that the details of 
the parameterizations in the various models exert a stronger 
influence, which explains the larger differences in the sup- 
pressed period. The ensemble spread is rather uniform and 
does not increase substantially with rainfall, which remains 
light throughout the period. The CRM behave similarly to 
the SCM. In the active period, solutions from the two model 
types track each other closely, again highlighting that pre- 
cipitation is constrained by the forcing in that period. Just as 
for the SCM, the differences between CRMs as well as to 
observations increase (in a relative sense) in the suppressed 
period. The CRM results in the active period strongly resemble
the results of the larger CRM comparison [Fridlind et al.,
2012], indicating that the CRMs shown here provide a
representative sample for this family of models. 

[ 34 ] Figure 3 provides a comparison of the multimodel 
best estimate ensemble and individual model ensembles for 
the time-mean surface precipitation averaged over the active 
and suppressed periods for all simulations used in this study. 
Each model is included as a box-whisker plot constructed 
from the time-averaged precipitation for each ensemble 
member. Observations are also included. It can clearly be 
seen that the ensemble SCM and CRM encompass a wide 
range of surface precipitation values. The models capture 
the spread seen in the observations. This is due to strong 
coupling between the forcing, which is primarily through 
vertical velocity, and rainfall. 

[ 35 ] The multimodel ensemble has a limited spread of 
surface precipitation as all models are simulating the same 
forcing. Figure 3 provides a useful check that the multimodel 
ensemble has limited spread compared to the SCM and 
CRM simulations. This result supports findings of Hume and 
Jakob [2005] that the largest spread in an SCM ensemble will
be found by varying the forcing (the boundary conditions). 
Figure 3 also shows the ensemble mean (small asterisk) 
and best estimate mean (large asterisk) precipitation for the 
observations and models. For most models, the magnitude 
of the best estimate observed precipitation is very close to 
the 50th percentile (median) precipitation with the ensemble 
mean larger. This is due to the ensemble having a distribu- 
tion which is skewed towards high values of precipitation 
leading to larger means than medians. 
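
The box-whisker summary and the mean-versus-median argument above can be illustrated with a minimal sketch. The ensemble values below are hypothetical (they are not TWP-ICE data), and the function name `box_stats` is introduced here only for illustration; the percentile method used for Figure 3 is not stated in the text, so linear interpolation is assumed.

```python
import math
import statistics


def box_stats(values):
    """Summarize an ensemble as in a box-whisker plot: selected percentiles plus the mean."""
    # quantiles(n=100) returns the 99 cut points for the 1st..99th percentiles.
    cuts = statistics.quantiles(values, n=100, method="inclusive")
    summary = {f"p{p:02d}": cuts[p - 1] for p in (5, 25, 50, 75, 95)}
    summary["mean"] = statistics.fmean(values)
    return summary


# Hypothetical 80-member ensemble of period-mean precipitation (mm/day).
# Values grow exponentially with member index, giving a tail towards high values.
ensemble = [5.0 * math.exp(0.03 * i) for i in range(80)]
stats = box_stats(ensemble)

print(stats)
print("mean exceeds median:", stats["mean"] > stats["p50"])
```

For a distribution with a tail towards high values, as assumed here, the mean lies above the 50th percentile, which is the behavior described for the ensemble precipitation.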

[36] Figure 4 shows time-height cross sections of the
observed, SCM-mean and CRM-mean modeled relative 
humidity. Relative humidity provides a useful perspective 
on the model simulations, since unlike precipitation, which 
is primarily driven by forcing, relative humidity is less con- 
strained by the forcing and more affected by model physics 
[Emanuel and Zivkovic-Rothman, 1999]. Given the model
setup (section 2.2), the models have freedom to develop 

Figure 4. Time-pressure relative humidity (with respect to
water) for the active and suppressed periods for observa- 
tions and SCM and CRM simulations. The SCM and CRM 
data are averaged over all models and all available ensemble 
members in the range 10-90. 

certain moisture source/sink terms such as moisture convergence
and surface evaporation. The ensemble sensitivity
to these terms will be addressed in section 3.1.3. Relative 
humidity with respect to water has been calculated using 
Tetens' formula [Lowe, 1977, equation 6] for each individual
simulation from values of temperature, water vapor, and
pressure to ensure consistency across models. The modeled
data shown in Figure 4 are an average over all models and all
ensemble members used. Detailed investigation shows that 
relative humidity differences are primarily caused by dif- 
ferences in moisture as temperature varies little across the 
simulations and is close to observations. 
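
A minimal sketch of such a relative humidity calculation is given below. It assumes the commonly quoted Tetens-type coefficients for saturation over liquid water (6.1078 hPa, 7.5, 237.3 °C), which may differ in detail from equation 6 of Lowe [1977], and it assumes that the water vapor variable is specific humidity in kg/kg. The function names are illustrative only.

```python
EPSILON = 0.622  # ratio of the gas constants of dry air and water vapor


def saturation_vapor_pressure_hpa(temp_k):
    """Saturation vapor pressure over liquid water (hPa), Tetens-type formula."""
    t_c = temp_k - 273.15  # convert to degrees Celsius
    return 6.1078 * 10.0 ** (7.5 * t_c / (t_c + 237.3))


def relative_humidity(temp_k, q_v, pressure_hpa):
    """Relative humidity (%) with respect to water.

    temp_k: temperature (K); q_v: specific humidity (kg/kg); pressure_hpa: pressure (hPa).
    """
    # Vapor pressure from specific humidity: q = eps * e / (p - (1 - eps) * e)
    e = q_v * pressure_hpa / (EPSILON + (1.0 - EPSILON) * q_v)
    return 100.0 * e / saturation_vapor_pressure_hpa(temp_k)


# Hypothetical midlevel values: 500 hPa, 267 K, 2 g/kg
print(round(relative_humidity(267.0, 0.002, 500.0), 1))
```

Applying one formula to every model's temperature, moisture, and pressure output, rather than using each model's internally diagnosed humidity, is what ensures the consistency across models mentioned above.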

[ 37 ] Observations show that the atmosphere has high rel- 
ative humidity through a deep layer during the active period, 
but the models generally underestimate humidity particu- 
larly at low levels. During the suppressed period, observa- 
tions show lower humidity above 800 hPa but large values 
in the boundary layer. All models capture the reduction in 
relative humidity caused by drying on the transition to the 
suppressed period above 700 hPa, although the SCM over- 
estimate the reduction in humidity. Both SCM and CRM 
persist with low values of humidity in the boundary layer 
compared to observations. 

[ 38 ] While Figure 4 shows the evolution of the mean state, 
the ensemble simulations also allow investigation of model 
sensitivity. Figure 5 shows time series of 500 mbar relative 
humidity for all ensemble members for each model com- 
pared to their best estimate simulations, ensemble mean, and 
observations. Relative humidity at 500 mbar is chosen, as 
accurate representation of moisture in midlevels is important 
if models are to correctly represent cloud. All models have 
high 500 mbar relative humidity during the active period 
consistent with the observations, but most SCM tend to 
have lower relative humidity than the CRM. The JMA and 
GISS models have particularly low relative humidity which 
is about 10% and 15% lower than the observations, respec- 
tively. The CLUBB and NCEP models have slightly larger 
relative humidity compared to the observations. All models 
have very limited spread during the active period. 

[ 39 ] Observations show that during the transition to the 
suppressed period, humidity reduces to around 60% after the 
passage of the MCS. Relative humidity increases slightly 
before it reduces again from 70% to 30% between days 27 
and 31 (27-31 January). There is a big difference between 
the responses of the SCM and CRM during this period. The 
CRM capture the transition to the suppressed period rea- 
sonably well with relative humidity 10% too low but its 
temporal evolution well captured. SCM generally reduce rel- 
ative humidity too much in the transition period with mean 
values after the transition ranging from 40% (UM) to 10% 
(JMA). An exception to this is the CLUBB model which 
does not excessively reduce relative humidity during the 
transition and is then too moist during the suppressed period. 

[ 40 ] The CRM show limited spread during the active 
period and the passage of the MCS. The spread in both 
model types is largest during the suppressed period. The 
SCM show larger but limited spread in the active period and 
in the transition associated with the MCS. Just like the CRM, 
they show increased spread during the suppressed period. 
This suggests a hypothesis that the simulation of midlevel 
relative humidity may be more sensitive to changes in the 
forcing when the forcing is weak. Furthermore, this sensitivity
results in nonlinearity between the ensemble members
which is particularly apparent during the suppressed period. 
For example, around 30 January, the SCAMS model shows 
that ensemble members with weaker (stronger) forcing have 
the lowest (highest) relative humidity despite the forcing not 
being the weakest (strongest) forcing. 

[ 41 ] Figure 5 shows that in general the ensemble mean 
and best estimate simulation results follow each other quite 
closely so that their differences from observations are sim- 
ilar. On some limited occasions, the ensemble mean is 
closer to the observations than the best estimate, for exam- 
ple, UM-PC during both the active and suppressed periods 
and CLUBB and SCAMS during the suppressed period. 
To further investigate the ensemble mean to best estimate 
behavior, Figure 6 shows profiles of the difference between 
the best estimate and the ensemble mean relative humid- 
ity for all SCM for the active period. Figure 6 shows that 
when averaged over this period most models have similar 
best estimate and ensemble mean relative humidity. How- 
ever, there are some important exceptions. For example, the 
UM-PC has larger ensemble mean relative humidity than its 
best estimate throughout the depth of the troposphere. This 
larger relative humidity in the ensemble mean represents an 
improvement in the model simulations by bringing the val- 
ues closer to observations. As UM-PC is the only SCM to 
include a stochastic parameterization, this result highlights 
that ensemble simulations are necessary when using models 
with stochastic components. The usefulness of the ensemble 
approach will be investigated further below. 

[ 42 ] When comparing the ensemble simulations with 
observations (Figure 5), it is possible, for some models and 
periods, to determine whether the errors are due to the prescribed
forcing or are model errors. Given that the observed
forcing spans the range of possible observations, none of the 
JMA ensemble members closely approximate the observed 
relative humidity during the active period. Therefore, this 
model clearly has limitations correctly simulating relative 
humidity during this period. For many models (including
SCAM, NCEP, GISS, and JMA), the ensemble shows
that errors in the transition to the suppressed period are likely
attributable to model error rather than errors in the forcing.
The GISS model also consistently underestimates relative 
humidity during the suppressed period. 
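
The attribution argument above amounts to checking whether the observations fall inside the envelope of the ensemble members. A minimal sketch follows, using hypothetical relative humidity values and a simple minimum-to-maximum envelope; the function names and the choice of envelope are assumptions made here for illustration and are not taken from the study.

```python
def envelope_contains(obs, members):
    """For each time, report whether the observation lies within the ensemble min-max range."""
    flags = []
    for t, o in enumerate(obs):
        values = [member[t] for member in members]
        flags.append(min(values) <= o <= max(values))
    return flags


def fraction_outside(obs, members):
    """Fraction of times at which no part of the ensemble envelope reaches the observation."""
    flags = envelope_contains(obs, members)
    return 1.0 - sum(flags) / len(flags)


# Hypothetical 500 mbar relative humidity (%) at three times for three ensemble members.
observed = [80.0, 60.0, 30.0]
ensemble_members = [
    [78.0, 40.0, 10.0],
    [82.0, 45.0, 15.0],
    [85.0, 50.0, 20.0],
]

print(envelope_contains(observed, ensemble_members))
print(fraction_outside(observed, ensemble_members))
```

Where the observation lies outside the envelope for every plausible forcing, as at the second and third times in this example, the discrepancy is more likely to reflect model error than forcing error.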

3.1.2. Precipitation Partitioning 

[ 43 ] An interesting question in the simulation of tropical 
convection is how the various SCMs partition the precip- 
itation between convection and the resolved scale motion. 
Furthermore, given the construction of the ensemble used
here, it is possible to study how this partitioning changes 
with forcing strength and meteorological situation. Figure 7 
shows the time average convective precipitation fraction 
(CPF), defined as the ratio of convective precipitation to total 
precipitation at the surface, against total precipitation for 
both the active and suppressed periods. Each SCM is shown 
by a color with different symbols used for the different 
models. Each point represents a single ensemble member 
averaged over the period of interest. An increase in total pre- 
cipitation (x axis) indicates an increase in forcing strength. 
The multimodel best estimate ensemble is shown as large 
asterisks. 
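
The CPF definition above can be sketched as follows. The text does not state whether the time average is taken before or after forming the ratio; this sketch assumes the ratio of time-averaged quantities, and the function name and sample values are hypothetical.

```python
import math


def convective_precip_fraction(convective, total):
    """CPF: time-mean convective surface precipitation divided by time-mean total precipitation."""
    mean_convective = sum(convective) / len(convective)
    mean_total = sum(total) / len(total)
    if mean_total <= 0.0:
        return math.nan  # CPF is undefined when there is no precipitation
    return mean_convective / mean_total


# Hypothetical precipitation rates (mm/h) for one ensemble member at three times.
convective_rate = [0.2, 0.4, 0.6]
total_rate = [0.5, 0.8, 1.1]
cpf = convective_precip_fraction(convective_rate, total_rate)
print(round(cpf, 3))
```

Computing one such value per ensemble member and plotting it against that member's mean total precipitation gives one point of the kind described for Figure 7.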

[ 44 ] Generally, there is a wide spread in the magnitude 
of CPF between the models ranging from 0.2 to 0.9 in the 

Figure 5. Time series of RH at 500 mbar for all SCM and CRM. Blue lines show best estimate sim-
ulations, red lines ensemble mean simulations, and the black line is observations. Gray lines show all 
ensemble members in the range 10-90. Key ensemble members, the 25th, 50th, and 75th percentiles of 
the 80-member large-scale “forcing,” are highlighted as thin black lines which are dash-dotted, solid, and 
dashed, respectively. Note that the CRM do not report best estimate data. 

Figure 6. Mean period-averaged difference between the
best estimate and ensemble mean relative humidity for the 
active period for each SCM. 

active period and 0.5 to 1 in the suppressed period. In the 
active period, the models also show a very diverse behavior 
with forcing strength, with some showing an increase in CPF 
(e.g., GISS, UM-GR), some showing a near-constant CPF 
(e.g., NCEP, SCAM), and some showing a decrease (e.g., 
UM-PC). The GFDL2 model shows a somewhat erratic 
behavior. Models of the same type show different behav- 
ior depending on the parameterization scheme used (e.g., 
UM-PC versus UM-GR). 

[45] In the suppressed period, all SCMs have a CPF of 
greater than 50%. There is a tendency in almost all mod- 
els for the CPF to increase with increasing forcing although 
there is much scatter in the relationship. There are two 
groups of models, with either very high or relatively low 
CPF. There is some consistency between the periods, with 
the GISS and UM-PC models showing the lowest CPF 
in both. 

[46] The rather wide spread in model behavior is likely 
indicative of large differences in the assumptions made in the 
different convection treatments on how to partition rainfall
between convection and the larger scales. As this will likely
have an impact on the vertical distribution of heating and 
moistening, an important issue for future work is to provide 
observational constraints for the relationships shown here. 

3.1.3. Ensemble Moisture Budget Characteristics

[ 47 ] The ensemble provides an opportunity to investigate 
the interplay between modeled moisture and the moisture 
budget terms. In particular, this study permits a compari- 
son between how the models control their moisture budgets. 
Given that the models are forced by prescribing horizon- 
tal advection terms and vertical velocity, they independently 
develop moisture budget terms such as vertical advection 
terms and moisture convergence in addition to the moisture 
contributions from parametrized processes such as convec- 
tion and surface evaporation. This is an important difference 
between this study and previous intercomparisons [e.g., 
Woolnough et al., 2010; Guichard et al., 2004] where the 
total moisture forcing was prescribed. Furthermore, given 
that this study also includes both best estimate and ensemble 
simulations, comparison can be made about the additional 
model characteristics exposed using an ensemble compared 
to a single best estimate simulation. 

[48] Figure 8 shows time average precipitable water 
against various terms in the moisture budget for the active 
period for all models and ensembles in this study. Very 
similar results are obtained for the suppressed period (not 
shown). Figure 8a shows that during the active period, the 
SCMs tend to divide into models in which lower precip- 
itable water is associated with larger precipitation (GISS and 
SCAM), models where precipitable water is higher for larger 
values of precipitation (UM and CLUBB), and those models, 
including CRM, where precipitation is independent of precipitable
water. The GFDL model is somewhat of an exception
as its relationship shows significant scatter.

[ 49 ] The largest term in the moisture budget is the mois- 
ture convergence term which is shown in Figure 8b. In 
all models, the moisture convergence term shows a similar 
magnitude and characteristics to precipitation which is not 
surprising as it is the largest source of moisture for the grid 
box exceeding surface evaporation by an order of magnitude 
(see below). Furthermore, Figure 8b shows that the moisture 

Figure 7. Time-averaged scatter plots of surface precipitation against convective precipitation (shown
as a fraction of the total surface precipitation) over the active (left) and suppressed (right) periods for 
ensemble members 10-90. Each model type is represented by a color and each model of a given type by 
a symbol. The multimodel best estimate ensemble is represented by a large asterisk. The CLUBB model 
and CRM do not submit partitioned precipitation data. 

Figure 8. Time-averaged scatter plots of PW against (a)
precipitation, (b) moisture convergence, and (c) surface 
evaporation over the active period for ensemble members 
10-90. Each model type is represented by a color and each 
model of a given type by a symbol. The CRM are rep- 
resented by large open symbols and the multimodel best 
estimate ensemble by a large asterisk. 


convergence acts as a feedback mechanism where SCM with
larger values of precipitable water enhance moisture sup- 
ply and produce more precipitation. Other models, despite 
the strong forcing, have lower precipitable water and lower 
moisture convergence. Petch et al. (submitted manuscript,
2012) discuss a likely reason by investigating the method
used to force the SCM compared to the method used to force
the CRM as used in Fridlind et al. [2012]. It was found that
given a positive moisture bias, convergence (which occurs 
during the active period) increases that positive bias, and 
similarly convergence enhances a negative moisture bias. 
Models forced by prescribing the total moisture forcing, as 
used in Fridlind et al. [2012], do not develop these biases. 
The ensemble results shown in Figure 8b support the findings
of Petch et al. (submitted manuscript, 2012). This model
response to bias is not, however, apparent when only the 
best estimate simulations are considered. GISS and SCAM 
both have a drier atmosphere during the active period compared
to the observations and other SCM, which results in
reduced precipitation compared to those SCM with a moister 
atmosphere. 

[ 50 ] Another important term in the moisture budget is 
surface evaporation. Figure 8c shows this term for each
model and ensemble member as before. Note that the surface 
evaporation term is an order of magnitude smaller than the 
moisture convergence contribution. It is evident that there 
is a fundamentally different relationship between forcing 
strength and evaporation in the SCMs and the CRMs indicat- 
ing differences in the physical mechanisms at work in these 
two classes of models. All SCMs approximate a quasi-linear 
relationship of evaporation to precipitable water, albeit of 
varying strength, with larger surface evaporation at lower 
values of precipitable water and lower surface evaporation 
when precipitable water is high. This is consistent with the 
formulation of the SCMs as, given that low level winds
and SST are prescribed in all models, evaporation can only 
change in response to atmospheric moisture. The CRMs on 
the other hand show a very different response to changes 
in the forcing. Here, the values of evaporation are inde- 
pendent of precipitable water. This indicates the importance 
of small-scale wind variability in driving surface evapora- 
tion. In the SCMs, this variability is not resolved. Unless 
it is parametrized, SCM surface fluxes are determined by 
the mean wind alone. In the CRMs, this wind variability is 
resolved and hence will enhance the surface fluxes. From 
the results, it is evident that the SCMs do not deal effec- 
tively with the subgrid variability. This result highlights the 
usefulness of the ensemble approach as this “error” in the 
SCMs would not have been evident from a set of single best 
estimate simulations. 
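
The role of unresolved wind variability can be illustrated with a standard bulk aerodynamic flux formula, E = ρ C_E |U| (q_s − q_a). This is a minimal sketch under stated assumptions: the bulk formula, the air density, the exchange coefficient, and the wind and humidity values are all chosen here for illustration and are not taken from any of the models in this study.

```python
import math

RHO_AIR = 1.2   # assumed air density (kg m-3)
C_E = 1.2e-3    # assumed bulk exchange coefficient for moisture (dimensionless)


def bulk_evaporation(wind_speed, q_sat_surface, q_air):
    """Surface evaporation (kg m-2 s-1) from a bulk aerodynamic formula."""
    return RHO_AIR * C_E * wind_speed * (q_sat_surface - q_air)


def speed_of_mean_wind(u, v):
    """Wind speed seen by a model that only knows the domain-mean wind vector."""
    return math.hypot(sum(u) / len(u), sum(v) / len(v))


def mean_of_wind_speed(u, v):
    """Domain-mean wind speed when local wind variability is resolved."""
    return sum(math.hypot(ui, vi) for ui, vi in zip(u, v)) / len(u)


# Hypothetical near-surface winds (m/s) at five locations: a 3 m/s mean flow plus gusts.
u_wind = [1.0, 2.0, 3.0, 4.0, 5.0]
v_wind = [2.0, -1.0, 0.0, 1.0, -2.0]
q_s, q_a = 0.024, 0.018  # hypothetical surface saturation and air specific humidity (kg/kg)

evap_mean_wind = bulk_evaporation(speed_of_mean_wind(u_wind, v_wind), q_s, q_a)
evap_resolved = bulk_evaporation(mean_of_wind_speed(u_wind, v_wind), q_s, q_a)
print(evap_mean_wind, evap_resolved)
```

Because the mean of the wind speed is never smaller than the speed of the mean wind, resolving the gusts can only increase the flux for a given moisture difference. With a fixed mean wind, the flux in this formula can change only through q_a, whereas resolved wind variability provides a second, moisture-independent control.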

[ 51 ] By using an ensemble approach, several interesting 
conclusions about model performance as well as simula- 
tion setup could be drawn. Given that strong precipitation 
in the models (and in nature) is strongly linked to moisture 
convergence, this exposes some interesting model behav- 
ior. By design of the simulations, moisture convergence 
is calculated by the models. Consequently, those mod- 
els that develop a dry bias cannot develop large moisture 
convergence and do not produce as much precipitation, 
with the opposite effect occurring in models with a moist 
bias. The SCMs require a drier atmosphere to develop 
stronger surface evaporation. In contrast, the CRMs develop 

Figure 9. Mean period-averaged cloud fraction (left) for the active period (top) and suppressed period
(bottom) for each model. For a limited number of models, the right-hand panels show the period mean 
together with ensemble members. Colored lines show the average for each model and gray lines ensemble 
members in the range 10-90. (Note that the CLUBB model does not include ice in cloud fraction and the 
LEM includes rain in cloud fraction.) 


evaporation changes independent of atmospheric moisture 
likely due to the development of subgrid scale wind variabil- 
ity not present in the SCMs. 

3.2. Clouds 

[ 52 ] This section investigates the simulation of cloud-related
variables in the CRMs and SCMs. Initially, the
vertical structure of liquid water and ice clouds is discussed
for both the active and suppressed monsoon periods. Following
this, the ensemble is again used
to investigate relationships between cloud-related variables
as the forcing strength changes. This will expose several
interesting characteristics of the various model parameterizations.

3.2.1. Profiles of Cloud Properties 

[ 53 ] Figure 9 shows vertical profiles of the ensem- 
ble mean model cloud fraction for all models during 
the active (top left) and suppressed (bottom left) period 
as well as selected examples of the full ensemble from 
three models for the active (top right) and suppressed 
(bottom right) periods. Cloud fractions generally reflect the 
meteorological conditions shown in Figure 4 with cloud 
throughout the troposphere during the more moist, active 
period and two cloud layers during the suppressed period:
low cloud between 950 and 750 hPa and high ice
cloud above 200 hPa.


[ 54 ] During the active period, there are large differences 
in CRM cloud fraction of around 30% at all levels, and 
the SCMs mostly fall within the range of the CRMs. This 
can largely be explained by the definition of cloud fraction, 
which in the LEM includes both cloud and precipitating 
hydrometeors, while in the SAM model it only includes 
cloud water and ice. All SCMs have cloud fraction less than 
30% below 600 hPa and more cloud (with the exception of 
JMA) above. There are large differences between the mod- 
els with slightly better agreement in lower levels than in the 
upper troposphere. 

[ 55 ] The differences in cloud fraction in the SCMs are also 
large in the suppressed period. One noticeable feature of the 
selected full ensembles (right panels) is that the difference 
of individual ensemble members from their mean tends to 
be smaller than the differences between models. This indi- 
cates that the differences in the simulated cloud structures 
are dominated by the structural properties of the models, not 
by the forcing data set, and shows that model representation 
of cloud is liable to error independent of the meteorologi- 
cal conditions. Best estimate simulations are therefore likely 
sufficient to expose model differences in this variable. This 
is investigated in Figure 10, which shows the differences 
of profiles of cloud cover between the ensemble mean and 
the best estimate simulation. As for relative humidity, most 
models show only small differences although with notable 

Figure 10. Mean period-averaged difference between the
best estimate and ensemble mean cloud fraction for the 
active period for each SCM. 

exceptions, the UM-PC around 400 hPa and GFDL below 
700 hPa. 

[56] Figure 11 shows profiles of ice water content in all 
models for the active period. Again, the ensemble means 
for all models are shown in the left panel, while selected 
full ensembles are shown in the right panel. The suppressed 
period is omitted from this figure as the ice cloud during
this period is not linked to local convection and is not well 
simulated. There are large differences between ice water 
content in both the CRMs and the SCMs during the active 
period which will impact on the model radiation budgets. 
Modeled ice water content differs in terms of both mag- 
nitude and vertical structure. Differences in the structural 
properties can again be noted in modeled ice water content 
with each SCM clustering around its ensemble mean. Diffi- 
culty in representing ice microphysics has been noted in all 
other TWP-ICE intercomparison studies and has been unan- 
imously suggested as a focus for future model development. 

[57] This section has shown that there are substan- 
tial differences in the vertical structure of parameterized 
cloud variables which may be attributable to systematic
differences in the representation of clouds between the models.
Structure in the cloud variables is clearly identifiable
using the ensemble in both the active and suppressed peri- 
ods. These persistent structures show that the models are not 
sensitive to changes in the forcing and that for most mod- 
els best estimate simulations are likely sufficient to expose 
the mean model behavior in both periods. It is clear from the 
large differences between them that the CRMs only provide 
a limited estimate of the truth, especially during the sup- 
pressed period, as their representations of clouds are limited 
themselves [Fridlind et al., 2012].

3.2.2. Ensemble Cloud Characteristics 

[58] While the previous section showed that it is likely 
that the mean cloud properties of each model can be exposed 
by a single best estimate simulation, the full ensemble 
results provide a useful tool to investigate how relation- 
ships between variables might change within each model 
as the forcing varies across ensemble members. Represent- 
ing the correct relationships between variables is a greater 
challenge for models than representing means, but it is also 
a necessary condition for applying the models over a wide 
range of conditions, such as a full GCM. This subsection 
will investigate how the ensemble developed here can be 
used to investigate relationships between different variables. 
Each ensemble member, experiencing different forcing data, 
can be considered as a separate test case, albeit spaced in 
a controlled manner from all other ensemble members.

[59] Figure 12 shows the mean liquid water path (LWP) 
as a function of the mean surface precipitation averaged over 
the active (left) and suppressed (right) periods for all mod- 
els. Each symbol represents an individual ensemble member. 
While there are generally different relationships between the
two periods (note the change in scale between periods in
the figure), the CRMs show that the relationship between LWP
and precipitation is linear (with a gradient of approximately
250 kg m⁻³ h in both the active and suppressed periods).
The CRMs agree very well during the suppressed period 
but differ at the larger precipitation rates during the active 
period. 
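
A gradient of this kind can be estimated by an ordinary least-squares fit across ensemble members. The sketch below assumes LWP in kg m⁻² and precipitation expressed as a depth rate in m h⁻¹, so that the gradient has units of kg m⁻³ h; the member values are hypothetical and constructed to lie exactly on a line, and the fitting method used in the study is not stated in the text.

```python
def least_squares_fit(x, y):
    """Return (gradient, intercept) of the ordinary least-squares line y = gradient * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    gradient = s_xy / s_xx
    return gradient, mean_y - gradient * mean_x


# Hypothetical period-mean values for four ensemble members.
precip = [2.0e-4, 4.0e-4, 6.0e-4, 8.0e-4]   # surface precipitation (m/h)
lwp = [250.0 * p + 0.01 for p in precip]     # liquid water path (kg/m2), exactly linear

gradient, intercept = least_squares_fit(precip, lwp)
print(gradient, intercept)
```

Fitting each model's ensemble separately gives one gradient per model, which is the quantity compared between the CRMs and SCMs in the discussion.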

[ 60 ] Most, but not all, SCMs also produce a linear rela- 
tionship between LWP and surface precipitation. Notable 
exceptions are the GFDL, SCAM, and JMA models. The 
relationships in the SCMs differ somewhat between the 




Figure 11. Mean period-averaged ice water content for the active period (left) for each model type. For a
limited number of models, the right-hand panels show the period mean together with ensemble members. 
Colored lines show the average for each model and gray lines ensemble members in the range 10-90. 

Figure 12. Time-averaged scatter plots of surface precipitation against liquid water path over the active
(left) and suppressed (right) periods for ensemble members 10-90. Each model type is represented by a 
color and each model of a given type by a symbol. The CRM are represented by large open symbols and 
the multimodel best estimate ensemble by a large asterisk. 


active and suppressed periods with a tendency for mod- 
els to have tighter and more linear relationships during the 
active period. In the suppressed period when precipitation 
is small, both the UM and NCEP models tend to have pre- 
cipitation independent of LWP, which itself is at an almost 
constant value. The GFDL and SCAM models tend to dis- 
play significant scatter in LWP with only a weak relationship 
to precipitation. In fact, only the CLUBB, GISS, and JMA 
models increase LWP with precipitation as the CRMs sug- 
gest during the suppressed period. A linear relationship was 
observed between LWP and precipitation in Fridlind et al. 
[2012].

[ 61 ] The CRMs tend to lie in the middle of the SCM
distribution, suggesting that the SCM ensemble mean may 
approximate the correct values of LWP, although individual 
models may differ quite considerably from the CRMs. The 
UM and NCEP models are biased low at all times, whereas 
the GISS and one of the JMA models have a LWP that is too 
large during the suppressed period. Unlike for cloud fraction
before, the best estimate simulations do not always fall
close to the center of the ensemble (note the position of the large asterisks
for GFDL and one JMA model relative to their associated ensemble
during the suppressed period). The ensemble results also
expose interesting nonlinearities in some of the models. For 
instance, there is a discontinuity in LWP in the GFDL around 
0.15 kg m⁻² during the active period. This possibly relates
to the discontinuity in the convective precipitation fraction 
in Figure 7. While magnitude differences are apparent in 
the multimodel ensemble, the relationships between LWP 
and precipitation are only found in the full ensemble show- 
ing a potential usefulness of an ensemble technique when 
identifying model behavior. 

[ 62 ] Figure 13a shows the relationship between IWP and 
precipitation during the active period. It can be seen that 
similar to LWP, IWP generally has a linear relationship 
with precipitation. Unlike the relationship of precipitation 
with LWP, the one with IWP is not consistent between the 
CRMs. There are very different magnitudes of IWP in the 
CRMs, and the slope of the relationship to precipitation 
varies strongly as well. Large differences in the simulation of 




Figure 13. Time-averaged scatter plots of surface precipitation against ice water path (left) and liquid 
water path against ice water path (right) over the active period for ensemble members 10-90. Each model 
type is represented by a color and each model of a given type by a symbol. The CRM are represented by 
large open symbols and the multimodel best estimate ensemble by a large asterisk. Note: GISS model is 
not shown in these plots. GISS IWP exceeds all other models by a factor of 2. 

Figure 14. Location of rain gauges used in this study in
and around the pentagon-shaped TWP-ICE domain. Com- 
parison of radar-derived rainfall data to gauge rainfall data 
is conducted at all stations. Representative results are shown 
for Batchelor station (YBCR, 131.0252W, 13.0545S) and
Charles Point (CHAP, 130.6309W, 12.389S). The colored 
area around each gauge shows the region of the TWP-ICE 
domain area closest to that gauge. 

IWP in CRMs have been identified in other studies [Fridlind 
et al., 2012]. The existence of those discrepancies makes it 
difficult to use the CRM results in assessing the SCM behavior.
The LEM has approximately double the IWP of
SAM, with a gradient of 300 kg m⁻³ h (LEM) compared
to 50 kg m⁻³ h in SAM. Most SCM have gradients around
this range, although in the NCEP model IWP is relatively 
insensitive to forcing. 

[ 63 ] The ensemble enables the comparison not only 
between models but also of different versions of the same 
model. For example, SCAMS and SCAMR have very
similar IWP, whereas SCAML, using a different microphysics
scheme, has twice the IWP of the other SCAM
models. There is a more marked difference between the two 
versions of the UM. UM-PC follows closely the gradient
and approximate magnitude of the LEM (the UK
Met Office’s CRM), which was used in the formulation
of the Plant and Craig [2008] stochastic convection parameterization
scheme. The UM-GR, on the other hand, is close
to the SAM CRM which shows that there is complex inter- 
play between the parameterization schemes. The UM SCM 
only differ in their convection parameterization, but this has 
a large effect on the IWP produced. In general, there is a 
split between models that follow the strong slope of the LEM 
and those closer to the weaker slope of the SAM. It is not 
possible, however, to attribute the relationship between pre- 
cipitation and IWP simply based on the model microphysics 
scheme. 

[ 64 ] Figure 13b shows the relationship between LWP 
and IWP and shows different aspects of the relationships 
between the variables in the models. There is a clear split
between some models that have larger ranges in LWP (e.g., 
SCAM and JMA models) and others that have larger ranges 
in IWP (e.g., UM-PC and CLUBB). Fridlind et al. [2012] 
found that 2-D CRMs have a weaker relationship than 3-D
CRMs between IWP and LWP, which is contrary to
Figure 13a. However, the 2-D version of the LEM used here
was not part of the Fridlind et al. [2012] study, and furthermore,
the SAM here used a single-moment microphysics
scheme, whereas the SAM in Fridlind et al. [2012] used a
double-moment scheme [Morrison et al., 2009] so a direct
comparison is not possible. 

[ 65 ] Interestingly, considering only the multimodel 
ensemble (Figure 13b) shows a different relationship 
between LWP and IWP compared to the relationship shown 
in the individual ensemble simulations. The ensemble 
within each model suggests increasing IWP with LWP, 
whereas the multimodel ensemble would suggest a tendency 
for IWP to increase with decreasing LWP. This shows the 
differences and potential limitations of using a multimodel
ensemble. Using a multimodel ensemble would suggest the
reverse of the relationship between variables that is
suggested by the CRMs and SCMs each simulating their own
ensemble.

[Figure 15 panel labels: YBCR, CHAP]

Figure 15. Distributions of radar-derived rainfall normalized by rain gauge rainfall for two rain gauges
for TWP-ICE. A log-normal fit is shown by the solid line. Statistics of the observed data and the fit data
are given in the top right corner of each panel.

Figure 16. Ensemble cumulative rainfall time series for TWP-ICE derived from error estimates in radar-
derived rainfall. Broken and light-colored lines show all ensemble members with key ensemble members
(5th, 25th, 50th, 75th, and 95th percentiles) as black continuous lines. The best estimate forcing, used
here and in the CRM intercomparison, is shown by small circles.
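
The reversal between the within-model and the multimodel relationship of IWP to LWP can be illustrated with a small synthetic example. The three models and all values below are hypothetical and chosen only to reproduce the qualitative effect: within each model IWP increases with LWP, yet a fit across the best estimate members of the models gives a negative slope.

```python
def least_squares_slope(x, y):
    """Ordinary least-squares slope of y against x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


# Hypothetical (LWP, IWP) ensembles for three models, arbitrary units.
# The middle member of each list stands in for the best estimate run.
models = {
    "A": ([0.1, 0.2, 0.3], [0.9, 1.0, 1.1]),
    "B": ([0.4, 0.5, 0.6], [0.5, 0.6, 0.7]),
    "C": ([0.7, 0.8, 0.9], [0.1, 0.2, 0.3]),
}

# Slope of IWP against LWP inside each model's own ensemble.
within_slopes = {
    name: least_squares_slope(lwp, iwp) for name, (lwp, iwp) in models.items()
}

# Slope across models using only the best estimate member of each.
best_lwp = [lwp[1] for lwp, _ in models.values()]
best_iwp = [iwp[1] for _, iwp in models.values()]
multimodel_slope = least_squares_slope(best_lwp, best_iwp)
```

Here every within-model slope is positive while the multimodel slope is negative, because the differences between models dominate the spread of the multimodel sample.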

4. Summary and Discussion 

[66] This study presents an ensemble of SCM and CRM 
simulations for the TWP-ICE period. The first purpose of 
the study was to derive an ensemble of model forcings based 
on observational uncertainty. This data set was then applied 
to a variety of models to assess what new information about 
model behavior and model error might be gleaned from an 
ensemble approach that could not be attained by a single 
realization commonly used in CRM and SCM studies. It was 
found that the overall model behavior in terms of the time 
evolution of thermodynamic variables or the time-averaged 
vertical structure of those variables generally changes lit- 
tle between the ensemble mean and a single “best estimate” 
simulation. However, there were some notable exceptions 
to that finding. In some model simulations, like those with 
the UM-PC, ensemble means deviate from the best esti- 
mate simulations throughout the troposphere. Given that the 
ensemble mean forcing is close to that of the best esti- 
mate, this indicates nonlinearities in the simulation behavior 
possibly due to the stochastic component of the model. 
The ensemble also shows that models have greater sen- 
sitivity when weakly forced, and therefore, an ensemble 
is necessary. Perhaps the main value the ensemble adds 
to single simulations is the possibility to investigate the 
changes in model behavior with changes in forcing. This 
has proved invaluable in highlighting several aspects of 
model behavior in this study, namely, (i) a distinctly different 
behavior in the SCMs from that in the CRMs in achieving
changes in surface evaporation, (ii) the sensitivity to
the particular forcing method applied, (iii) a wide spread 
in the convective precipitation fraction in models and its 
sensitivity to forcing strength, and (iv) distinctly different 
model behavior in the relationships between cloud variables 
and precipitation. 

[ 67 ] Examining the terms of the moisture budget using 
the ensemble enabled interesting conclusions about model 
behavior for two important terms: the surface evaporation


and the moisture convergence. A clear distinction exists 
between the CRMs and the SCMs. In the CRMs, evapora- 
tion increases for constant atmospheric moisture, whereas 
the SCMs can only increase evaporation by drying the atmo- 
sphere. This suggests a role of subgrid variability likely 
brought about by cold pools in the CRMs that is not param- 
eterized in SCMs. A representation of cold pool dynamics 
in SCMs would allow surface evaporation to occur in a 
moist atmosphere. Studying the moisture convergence term 
as a function of forcing strength revealed an interesting 
feedback between model error and the particular forcing 
approach chosen here. As the models are forced with hori- 
zontal moisture advection and vertical motion profiles (and 
hence profiles of mass convergence and divergence), they 
develop their own vertical moisture advection and moisture 
convergence terms. In models that develop a moist (dry) bias,
this bias is reinforced by an increase (decrease) of the
moisture convergence into the region. This limitation
can easily be deduced using the ensemble approach, whereas
it would go largely unnoticed in single simulations with a
number of models.

[68] The ensemble was also shown to be useful in inves- 
tigating cloud variables and their relationships. Ensemble 
vertical profiles generally highlight structural differences 
between different models in that all ensemble members 
of a particular model tend to lie closer to its mean than 
to that of other models, even with large variations in the 
forcing. Consistent with the results in the accompanying 
modeling studies for TWP-ICE [Lin et al., 2012; Fridlind
et al., 2012; Zhu et al., 2012], large differences are found in 
the models’ simulation of cloud ice, highlighting this area 
once again as one warranting further study. The ensemble 
is used to identify relationships between liquid water, cloud 
ice, and precipitation. CRM simulations, while varying in 
magnitude, show clear linear relationships between those 
variables. This behavior is not reproduced in all SCMs, some 
of which show strongly nonlinear behavior or even jumps. 
The ensemble also reveals that the ice water path to liquid 
water path relationships are very different between models, 
with one group of models showing a very strong increase 
of IWP with LWP, while in others IWP is almost independent
of LWP. This conclusion applies to both CRMs and
SCMs. Using the multimodel best estimate ensemble only, 
the important relationship of increasing ice water path with
liquid water path in individual models is reversed. 

[69] This study shows that the introduction of an ensem- 
ble to a modeling study provides more information than 
might be gathered by simulating only simple best estimate 
forcing. While the method does not replace the standard 
best estimate approach to single-column modeling, it com- 
plements it by (i) providing an easy framework to study 
model sensitivities and (ii) increasing confidence in detect- 
ing model behavior that is likely due to model, rather than 
forcing, limitations. Future SCM studies should therefore 
consider adding ensemble simulations in addition to, rather 
than instead of, the more conventional best estimate method. 
Despite the additional information provided by the ensem- 
ble, it remains difficult to conclusively link model behavior 
in an SCM to parameterization assumptions, highlighting the 
need to embed studies like the one presented here into a 
larger framework of model evaluation. 

Appendix A: Derivation of the Large-Scale 
Forcing Ensemble 

[ 70 ] An important part of this study is the use of an 
ensemble of large-scale forcing data sets. The motivation 
for doing so is to assess the inherent uncertainty in deriv- 
ing a single best estimate of the large-scale atmosphere 
from observations and in its subsequent application to drive 
model simulations. This appendix describes the construction 
of the ensemble used in this study, which is based on two 
steps: (i) estimate errors in the area-mean rainfall
and construct alternative rainfall scenarios and (ii) apply
a constrained variational analysis to each of the rainfall sce- 
narios derived in the first step to yield the final ensemble of 
large-scale atmospheric states. 

A1. Deriving an Ensemble of Rainfall Estimates

[ 71 ] The main source of area-mean rainfall information in
this and other TWP-ICE studies [e.g., Xie et al., 2010] is
rainfall estimates from a C-band polarimetric radar located 
near Darwin [Keenan et al., 1998]. The algorithm used 
to estimate rainfall from radar variables is that of Bringi 
and Chandrasekar [2001]. While the radar provides excel- 
lent spatial coverage to estimate area means, deriving rain 
rates from radar variables will lead to errors in the rainfall 
estimates. A first step in the ensemble construction is to esti- 
mate these errors. To do so, we use rain gauge observations 
around Darwin and apply a method very similar to that of 
Jordan et al. [2003]. 

[ 72 ] Radar rain rates vary in space and time, and radar 
errors may vary considerably based on location and tim- 
ing of rain events. The array of rain gauge data shown in 
Figure 14 is used as a reference for the radar-derived
rainfall data. A grid of 3 × 3 radar pixels (approximately 9 km²)
is averaged and compared to rain gauge measurements over
an accumulated period of 180 min where both rain rates are 
greater than 1 mm. By performing this analysis at many loca- 
tions over the TWP-ICE domain, it is anticipated that the 
differing sources of error may be better accounted for. 
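
The comparison described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the function name is hypothetical, the inputs are taken to be 180 min accumulations in mm, and the 1 mm threshold is interpreted as applying to both the radar average and the gauge accumulation.

```python
def radar_gauge_ratio(radar_pixels_mm, gauge_mm, threshold_mm=1.0):
    """Ratio of the 3 x 3 pixel-mean radar rainfall to gauge rainfall.

    radar_pixels_mm: nine accumulations (mm) from the pixels around a gauge.
    gauge_mm: gauge accumulation (mm) over the same period.
    Returns None when either value does not exceed the threshold.
    """
    radar_mean = sum(radar_pixels_mm) / len(radar_pixels_mm)
    if radar_mean > threshold_mm and gauge_mm > threshold_mm:
        return radar_mean / gauge_mm
    return None


# Hypothetical accumulations for one gauge and one 180 min window.
pixels = [4.0, 5.0, 6.0, 5.0, 5.0, 5.0, 4.5, 5.5, 5.0]
ratio = radar_gauge_ratio(pixels, gauge_mm=4.0)
rejected = radar_gauge_ratio(pixels, gauge_mm=0.5)  # gauge below threshold
```

Collecting the non-rejected ratios over many windows at one gauge gives the kind of sample to which an error distribution can be fitted.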

[ 73 ] Examples of the ratio of radar-derived rainfall data 
to rain gauge rainfall data are shown in Figure 15 for two 
rain gauges. Assuming that rain gauge data may be a better 
estimate of rainfall than radar-derived data, ratios close to 1 


suggest small errors in the radar data, with smaller standard
deviations indicating tighter clustering of the errors. The statistics
in Figure 15 for the observed data show differences in the 
mean values and standard deviations at the two locations, 
suggesting that indeed errors have different spatial patterns. 
As the data tend to cluster about 1, the two observed data sets 
predominantly agree on the magnitude of rainfall, although 
the long tails of the error distribution show that large
errors occur on occasion.

[ 74 ] A log-normal distribution is fitted to the errors shown 
in Figure 15. The log-normal distribution parameters are 
estimated and used to construct an ensemble of rain rates 
at each radar pixel as follows. The distribution of radar to 
gauge rainfall ratios is divided into 100 percentiles. Then the 
ratio for each percentile is used to multiply the radar rain val- 
ues, providing 100 rainfall values (one for each percentile) at 
each radar pixel. For each radar pixel, the error distribution 
derived at the nearest rain gauge is used. Figure 14 shows the 
areas (colored) for which error characteristics are assumed 
constant in space based on the nearest rain gauge behavior. 
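
A minimal sketch of this construction is given below. Several details are assumptions rather than statements from the text: the log-normal parameters are estimated as the mean and standard deviation of the log ratios, the 100 percentiles are evaluated at the midpoints (k - 0.5)/100, and all function names and sample ratios are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def fit_lognormal(ratios):
    """Estimate log-normal parameters (mu, sigma) from positive ratios."""
    logs = [math.log(r) for r in ratios]
    return mean(logs), stdev(logs)


def percentile_ratios(mu, sigma, n=100):
    """Ratio at each of n percentile midpoints of the fitted distribution."""
    std_normal = NormalDist()
    return [
        math.exp(mu + sigma * std_normal.inv_cdf((k - 0.5) / n))
        for k in range(1, n + 1)
    ]


def rainfall_ensemble(radar_rain, ratios):
    """Multiply one radar rain value by the ratio for each percentile."""
    return [radar_rain * r for r in ratios]


# Hypothetical radar-to-gauge ratios observed at the nearest gauge.
observed = [0.6, 0.8, 0.9, 1.0, 1.1, 1.3, 1.7]
mu, sigma = fit_lognormal(observed)
ratios = percentile_ratios(mu, sigma)
members = rainfall_ensemble(10.0, ratios)  # 10.0 is an illustrative rain value
```

Because every pixel assigned to the same gauge shares one fitted distribution, the list of percentile ratios needs to be computed only once per gauge region.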

[ 75 ] Having derived rainfall error estimates at each radar
pixel, which are expressed as 100 values of rainfall from the
lowest to the highest, the next task is to estimate the error 
in the area-mean rainfall. This requires assumptions about 
the spatial correlation of the individual pixel errors. As our 
goal is to span the widest range of possibilities, we will 
assume the worst case scenario of maximum correlation. 
In other words, we assume that whenever the largest
possible error occurs at one pixel, the largest error in the same
direction occurs at all radar pixels. This is an extremely 
simple assumption and will maximize the possible error in 
the area-mean rainfall, consistent with our goal to maxi- 
mize ensemble spread. Using this assumption, 100 values of 
area-mean rainfall are derived by simply averaging the pixel- 
rainfall rates within each percentile, i.e., the first percentile 
of the area-mean rainfall distribution is simply the average 
of all first-percentile values at each pixel and so on stepping 
through all percentiles. Figure 16 shows the 100 cumulative 
rainfall time series derived in this way for TWP-ICE. For
comparison, the figure includes the best estimate rainfall time series
as derived by Xie et al. [2010], which falls close to the 50th
percentile, as might be anticipated from the way the
distribution was constructed. While the error estimates allow
for a large range of possible rainfall values, the central 50%
of the distribution (between the 25th and 75th percentiles)
spans only a limited range of rainfall.
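
Under the maximum-correlation assumption, the area mean for a given percentile is just the average of the pixel values at that same percentile. The sketch below shows this step and the accumulation into a cumulative series; the function names and the numbers are hypothetical and only three percentiles and two pixels are used to keep the example readable.

```python
from itertools import accumulate


def area_mean_by_percentile(pixel_members):
    """Average pixel rainfall within each percentile.

    pixel_members[p][k] is the rainfall at pixel p for percentile k.
    """
    n_pixels = len(pixel_members)
    n_percentiles = len(pixel_members[0])
    return [
        sum(pixel[k] for pixel in pixel_members) / n_pixels
        for k in range(n_percentiles)
    ]


def cumulative_rainfall(area_mean_series):
    """Running total of an area-mean rainfall time series."""
    return list(accumulate(area_mean_series))


# Two hypothetical pixels, three percentiles each, at one time step.
pixel_members = [
    [1.0, 2.0, 4.0],
    [3.0, 4.0, 8.0],
]
area_means = area_mean_by_percentile(pixel_members)

# Hypothetical area-mean series for one percentile over three time steps.
cumulative = cumulative_rainfall([2.0, 0.0, 3.0])
```

Averaging within a percentile, rather than sampling pixel errors independently, is what keeps the ensemble spread at its maximum: independent errors would partly cancel in the area mean.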

A2. Deriving the Large-Scale Atmospheric State 

[ 76 ] Each of the 100 rainfall scenarios derived above 
is used separately in the variational analysis algorithm of 
Zhang et al. [2001] (all other observations, such as thermo- 
dynamic variables, horizontal winds, and radiation terms, 
are unchanged and are the same for each scenario) to pro- 
duce 100 separate forcings that are all equally possible given 
the uncertainty in area-mean rainfall. The higher (lower)
percentile corresponds to stronger (weaker) surface pre- 
cipitation and generally stronger (weaker) vertical motion. 
The characteristics of the vertical motion for the active and 
suppressed periods are discussed in the main text. 

[ 77 ] Investigations were made into whether the additional 
variational analysis inputs should be modified in order to 
be more physically consistent. For example, an estimate
of rainfall error has been used to derive alternative rain- 
fall time series, but increased rainfall may, in the simplest 
terms, also be associated with more deep cloud and therefore 
reduced top-of-the-atmosphere longwave radiation, which is 
also an input to the variational analysis. Sensitivity studies 
where the radiation was varied in conjunction with rainfall 
had little impact on the resulting large-scale atmosphere. 
This supports Zhang et al. [2001], who suggested that rain- 
fall provided the largest contribution term in the variational 
analysis. 

[ 78 ] The 100 large-scale data sets so derived are used to
provide forcing data for the SCMs and CRMs as described in the
main text.